Smart Paper Notes

Online Data Selection for Federated Learning with Limited Storage

Chen Gong, Zhenzhe Zheng, Fan Wu, Bingshuai Li, Yunfeng Shao, Guihai Chen

Category: Machine Learning

2022-09-01

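Machine learning models have been deployed in mobile networks to process data from different layers, enabling automated network management and on-device intelligence. To overcome the high communication cost and severe privacy concerns of centralized machine learning, federated learning (FL) has been proposed to achieve distributed machine learning among networked devices. While computation and communication limitations have been widely studied in FL, the impact of on-device storage on FL performance remains unexplored. Without an efficient and effective data selection policy to filter the abundant streaming data on devices, classical FL can suffer from much longer model training time (more than $4\times$) and a significant reduction in inference accuracy (more than $7\%$), as observed in our experiments. In this work, we take the first step toward online data selection for FL with limited on-device storage. We first define a new data valuation metric for data selection in FL: the projection of the local gradient over an on-device data sample onto the global gradient over the data from all devices. We further design \textbf{ODE}, a framework of \textbf{O}nline \textbf{D}ata s\textbf{E}lection for FL, which coordinates networked devices to store valuable data samples collaboratively, with theoretical guarantees for simultaneously speeding up model convergence and enhancing final model accuracy. Experimental results on one industrial task (mobile network traffic classification) and three public tasks (a synthetic task, image classification, and human activity recognition) show the remarkable advantages of ODE over state-of-the-art approaches. In particular, on the industrial dataset, ODE achieves up to a $2.5\times$ speedup in training time and a $6\%$ increase in final inference accuracy, and is robust to various factors in practical environments.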
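
The valuation metric described above can be illustrated with a minimal sketch. It assumes a linear least-squares model so that per-sample gradients have a closed form, and it reads "projection" as the scalar projection onto the global gradient. The function names (`sample_gradient`, `projection_value`, `select_top_k`) are hypothetical and are not taken from the ODE implementation.

```python
import numpy as np

def sample_gradient(w, x, y):
    """Gradient of the squared error 0.5 * (w.x - y)^2 with respect to w."""
    return (w @ x - y) * x

def projection_value(local_grad, global_grad):
    """Scalar projection of a per-sample gradient onto the global gradient."""
    norm = np.linalg.norm(global_grad)
    if norm == 0.0:
        return 0.0  # no global direction to project onto
    return float(local_grad @ global_grad / norm)

def select_top_k(w, samples, global_grad, k):
    """Indices of the k samples whose gradients align best with the global gradient."""
    scored = [
        (projection_value(sample_gradient(w, x, y), global_grad), i)
        for i, (x, y) in enumerate(samples)
    ]
    scored.sort(reverse=True)  # highest projection first
    return [i for _, i in scored[:k]]
```

Under this reading, a device with room for only `k` samples would keep the ones returned by `select_top_k`. How the global gradient is estimated without sharing raw data is outside the scope of this sketch.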

translated by Google Translate

A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

Chao Wen, Miao Xu, Zhilin Zhang, Zhenzhe Zheng, Yuhui Wang, Xiangyu Liu, Yu Rong, Dong Xie, Xiaoyang Tan, Chuan Yu

Category: Artificial Intelligence

2021-06-11

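In online advertising, auto-bidding has become an essential tool for advertisers to optimize their preferred ad performance metrics by simply expressing high-level campaign objectives and constraints. Previous works designed auto-bidding tools from the view of a single agent, without modeling the mutual influence between agents. In this paper, we instead consider the problem from a distributed multi-agent perspective and propose a general $\underline{M}$ulti-$\underline{A}$gent reinforcement learning framework for $\underline{A}$uto-$\underline{B}$idding, namely MAAB, to learn auto-bidding strategies. First, we investigate the competition and cooperation relationships among auto-bidding agents and propose a temperature-regularized credit assignment to establish a mixed cooperative-competitive paradigm. By carefully trading off competition and cooperation among agents, we can reach an equilibrium state that guarantees not only each individual advertiser's utility but also the system performance (i.e., social welfare). Second, to avoid the potential collusion behavior of bidding low prices that cooperation may induce, we further propose bar agents that set a personalized bidding bar for each agent, thereby alleviating the revenue degradation caused by cooperation. Third, to deploy MAAB in a large-scale advertising system, we propose a mean-field approach. By grouping advertisers with the same objective into a mean auto-bidding agent, the interactions among large-scale advertisers are greatly simplified, making it practical to train MAAB efficiently. Extensive experiments on an offline industrial dataset and the Alibaba advertising platform demonstrate that our approach outperforms several baseline methods in terms of social welfare and revenue.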

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

Sucheng Ren, Fangyun Wei, Zheng Zhang, Han Hu

Category: Computer Vision

2023-01-03

Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications benefit only marginally, or not at all, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distillation targets, losses, input, network regularization, and sequential distillation, among others, revealing that: 1) distilling token relations is more effective than CLS token-based and feature-based distillation; 2) using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student differs from that of the teacher; 3) weak regularization is preferred. With these findings, we achieve significant fine-tuning accuracy improvements over MIM pre-training from scratch on ImageNet-1K classification for the ViT-Tiny, ViT-Small, and ViT-Base models, with gains of +4.2%/+2.4%/+1.4%, respectively. Our TinyMIM model of base size achieves 52.2 mIoU on ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way to develop small vision Transformer models: exploring better training methods rather than introducing inductive biases into architectures, as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
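
To make "distilling token relations" concrete, here is a minimal sketch of one possible formulation. It assumes that relations are softmax-normalized query-key similarities and that the loss is a KL divergence from teacher to student. The abstract does not specify either choice, so both are assumptions, and the function names are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def token_relations(q, k):
    """Row-stochastic (tokens x tokens) relation matrix from query/key features."""
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d))

def relation_distill_loss(student_rel, teacher_rel, eps=1e-12):
    """Mean KL(teacher || student) over token rows."""
    kl = teacher_rel * (np.log(teacher_rel + eps) - np.log(student_rel + eps))
    return float(kl.sum(axis=-1).mean())
```

Because a relation matrix has shape tokens x tokens regardless of the feature width, a narrow student can be matched against a wide teacher without an extra projection layer. This is one plausible reason relation targets suit small models.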

Backdoor Attacks Against Dataset Distillation

Yugeng Liu, Zheng Li, Michael Backes, Yun Shen, Yang Zhang

Category: Machine Learning

2023-01-03

Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.

Cluster-guided Contrastive Graph Clustering Network

Xihong Yang, Yue Liu, Sihang Zhou, Siwei Wang, Wenxuan Tu, Qun Zheng, Xinwang Liu, Liming Fang, En Zhu

Category: Machine Learning

2023-01-03

Benefiting from its ability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms prevent existing algorithms from improving further. 1) The quality of positive samples depends heavily on carefully designed data augmentations, and inappropriate augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls samples from the same cluster together and pushes samples from other clusters away by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.

Explaining Imitation Learning through Frames

Boyuan Zheng, Jianlong Zhou, Chunjie Liu, Yiqiao Li, Fang Chen

Category: Machine Learning | Computer Vision

2023-01-03

As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the resulting environment returns, the conventional evaluation outcome, as coefficients to build an importance map. We also conduct experiments to investigate three major questions: whether frames are equally important, how effective the importance map is, and how importance maps from different IL models are related. The results show that R2RISE successfully distinguishes important frames in the demonstrations.
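
The masking-and-retraining loop described above can be sketched as follows. This is a RISE-style approximation under stated assumptions, not the R2RISE implementation: `train_and_evaluate` is a hypothetical callback that retrains an IL model on the frames kept by a boolean mask and returns the environment return, and the normalization by keep counts is an assumed design choice.

```python
import numpy as np

def frame_importance(num_frames, train_and_evaluate, num_rounds=200,
                     keep_prob=0.5, seed=0):
    """Average environment return observed whenever each frame was kept."""
    rng = np.random.default_rng(seed)
    weighted = np.zeros(num_frames)  # sum of returns over rounds that kept the frame
    counts = np.zeros(num_frames)    # number of rounds that kept the frame
    for _ in range(num_rounds):
        mask = rng.random(num_frames) < keep_prob  # True = frame is kept
        ret = train_and_evaluate(mask)             # retrain on masked demos, then evaluate
        weighted += ret * mask
        counts += mask
    return weighted / np.maximum(counts, 1)        # avoid division by zero
```

With a toy evaluator such as `lambda mask: 1.0 if mask[2] else 0.0`, frame 2 receives the highest score, because the return is high only when that frame is present.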

Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment

Liqun Lin, Yang Zheng, Weiling Chen, Chengdong Lan, Tiesong Zhao

Category: Computer Vision

2023-01-03

Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical for improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e., blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e., flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with low computational cost and higher consistency with human visual perception. For temporal artifacts, we improve the self-attention-based TimeSFormer to detect them. Based on the six types of PEAs, we propose a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM). Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.

A New Perspective to Boost Vision Transformer for Medical Image Classification

Yuexiang Li, Yawen Huang, Nanjun He, Kai Ma, Yefeng Zheng

Category: Computer Vision | Artificial Intelligence

2023-01-03

The Transformer has achieved impressive success in various computer vision tasks. However, most existing studies require pretraining the Transformer backbone on a large-scale labeled dataset (e.g., ImageNet) to achieve satisfactory performance, and such a dataset is usually unavailable for medical images. Additionally, due to the gap between medical and natural images, the improvement brought by ImageNet-pretrained weights degrades significantly when the weights are transferred to medical image processing tasks. In this paper, we propose Bootstrap Own Latent of Transformer (BOLT), a self-supervised learning approach designed specifically for medical image classification with a Transformer backbone. BOLT consists of two networks, the online and target branches, for self-supervised representation learning. Concretely, the online network is trained to predict the target network's representation of the same patch embedding tokens under a different perturbation. To make the most of the Transformer with limited medical data, we propose an auxiliary difficulty ranking task, in which the Transformer must identify which branch (i.e., online or target) is processing the more difficult perturbed tokens. Overall, the Transformer learns to distill transformation-invariant features from the perturbed tokens so that it can both measure difficulty and maintain the consistency of the self-supervised representations. BOLT is evaluated on three medical image processing tasks: skin lesion classification, knee fatigue fracture grading, and diabetic retinopathy grading. The experimental results validate the superiority of BOLT for medical image classification compared with ImageNet-pretrained weights and state-of-the-art self-supervised learning approaches.

Benchmarking the Robustness of LiDAR Semantic Segmentation Models

Xu Yan, Chaoda Zheng, Zhen Li, Shuguang Cui, Dengxin Dai

Category: Computer Vision

2023-01-03

When using LiDAR semantic segmentation models for safety-critical applications such as autonomous driving, it is essential to understand and improve their robustness with respect to a wide range of LiDAR corruptions. In this paper, we aim to comprehensively analyze the robustness of LiDAR semantic segmentation models under various corruptions. To rigorously evaluate the robustness and generalizability of current approaches, we propose a new benchmark called SemanticKITTI-C, which features 16 out-of-domain LiDAR corruptions in three groups: adverse weather, measurement noise, and cross-device discrepancy. Then, we systematically investigate 11 LiDAR semantic segmentation models spanning different input representations (e.g., point clouds, voxels, and projected images), network architectures, and training schemes. Through this study, we obtain two insights: 1) the input representation plays a crucial role in robustness, and different representations perform differently under specific corruptions; 2) although state-of-the-art methods for LiDAR semantic segmentation achieve promising results on clean data, they are less robust when dealing with noisy data. Finally, based on these observations, we design a robust LiDAR segmentation model (RLSeg) that greatly boosts robustness with simple but effective modifications. We hope that our benchmark, comprehensive analysis, and observations can boost future research on robust LiDAR semantic segmentation for safety-critical applications.

Bayesian Generalized Kernel Inference for Exploration of Autonomous Robots

Yang Xu, Ronghao Zheng, Senlin Zhang, Meiqin Liu

Category: Robotics

2023-01-02

This paper concerns highly efficient information-theoretic robot exploration with the desired performance in complex scenes. We build a continuous, lightweight inference model to predict the mutual information (MI), and the associated prediction confidence, of the robot's candidate actions that have not been evaluated explicitly. This allows the decision-making stage of robot exploration to run with approximately logarithmic complexity, which also benefits online exploration in large, unstructured, and cluttered places that need more spatial samples to assess and decide. We also develop an objective function to balance the locally optimal action with the highest MI value against the global choice with high prediction variance. Extensive numerical and dataset simulations show the desired efficiency of our proposed method without loss of exploration performance in different environments. We also release our open-source implementation on GitHub for the robotics community.
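
The trade-off between the highest predicted MI and high prediction variance can be illustrated with an upper-confidence-bound style score. The abstract does not give the actual objective, so the additive form, the weight `beta`, and the name `select_action` are all assumptions made for illustration.

```python
import numpy as np

def select_action(mi_mean, mi_var, beta=1.0):
    """Pick the candidate action maximizing predicted MI plus a variance bonus.

    mi_mean: predicted mutual information per candidate action
    mi_var:  prediction variance per candidate action
    beta:    hypothetical weight; 0 is purely greedy, larger values favor exploration
    """
    score = np.asarray(mi_mean, dtype=float) + beta * np.sqrt(np.asarray(mi_var, dtype=float))
    return int(np.argmax(score)), score
```

With `beta=0` the rule reduces to choosing the action with the highest predicted MI. Increasing `beta` shifts the choice toward actions whose MI prediction is uncertain.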
